Entity-Focused Sentence Simplification for Relation Extraction

Authors

  • Makoto Miwa
  • Rune Sætre
  • Yusuke Miyao
  • Jun'ichi Tsujii
Abstract

Relations between entities in text have been widely researched in the natural language processing and information extraction communities. The region connecting a pair of entities (in a parsed sentence) is often used to construct kernels or feature vectors that can recognize and extract interesting relations. Such regions are useful, but they can also incorporate unnecessary distracting information. In this paper, we propose a rule-based method to remove the information that is unnecessary for relation extraction. Protein–protein interaction (PPI) is used as an example relation extraction problem. A dozen simple rules are defined on output from a deep parser. Each rule specifically examines the entities in one target interaction pair. These simple rules were tested using several PPI corpora. The PPI extraction performance was improved on all the PPI corpora.

Similar resources

Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency

Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in biomedical literature, researchers have employed some sentence simplification techniques to improve the performance of the relation extraction methods. However, due to the difficulty of the task, there is no noteworthy improvement in t...

Simplification de phrases pour l'extraction de relations (Sentence Simplification for Relation Extraction) [in French]

Machine learning based relation extraction requires large annotated corpora to take into account the variability in the expression of relations. To deal with this problem, we propose a method for simplifying sentences, i.e. for reducing the syntactic variability of the relations. Simplification requires the annotation of a small corpus, which will...

A Sentence Simplification System for Improving Relation Extraction

In this demo paper, we present a text simplification approach that is directed at improving the performance of state-of-the-art Open Relation Extraction (RE) systems. As syntactically complex sentences often pose a challenge for current Open RE approaches, we have developed a simplification framework that performs a pre-processing step by taking a single sentence as input and using a set of syn...

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

Enhancing the Interoperability of iSimp by Using the BioC Format

This paper reports the use of the BioC format in our sentence simplification system, iSimp, so that it could be seamlessly used in text mining pipelines. iSimp is designed to simplify complex sentences commonly found in biomedical text, therefore bringing benefits to existing text mining applications that rely on t...

Publication date: 2010